(54) Conditional execution speed-up on synchronizing instructions

(57) A method for conditional speed-up of execution of an instruction sequence having synchronization instructions (e.g. in
a computer system in which compatibility with instruction sequences written for the Intel 80386 or earlier processors is
desirable) involves replacement of a synchronization (WAIT) instruction with a null instruction in cases where the WAIT
instruction is followed by a floating point instruction including a WAIT as an integral part thereof.
At least one drawing originally filed was informal and the print reproduced here is taken from a later filed formal copy.
1. Field of the Invention.
The present invention relates to the field of
synchronization control in computer systems.

2. Description of Related Art. 

The closest art known to the Applicant is embodied in
the Intel 80386™ ('386) microprocessor manufactured by
Intel Corporation of Santa Clara, California.

In general, computer systems utilizing the '386
microprocessor embody a number of components, such as the
'386 microprocessor, a math coprocessor (typically either
the 80287 or 80387 numeric coprocessor), etc. In computer
systems utilizing the '386 processor, the general purpose
microprocessor (i.e., the '386) and the math coprocessor
are separate, discrete components.

The architecture of the '386, as in many general
purpose processors, includes synchronization instructions;
such synchronization instructions allow for synchronization
of processing between components in a computer system
utilizing the '386. Further, synchronization instructions
may provide means for initiating error checking.
For example, the WAIT instruction in the '386
instruction set causes the '386 to suspend execution until a
numeric coprocessor (such as the 80287 or 80387) has
finished a task. In general, the numeric coprocessor
activates a BUSY pin. When the BUSY pin is active (brought
low in the '386), the WAIT instruction suspends execution
of '386 instructions until the BUSY pin is inactivated
(brought high). In this way, processing on the '386
microprocessor may be suspended to guarantee that a numeric
instruction being processed by the numeric coprocessor has
completed execution.
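
The stall behaviour described above may be illustrated by the following minimal sketch. It is not taken from the '386 documentation: the function name and the representation of the BUSY pin as one boolean sample per clock cycle are assumptions made for illustration only, and the electrical polarity of the pin (active low) is abstracted away.

```python
def cycles_stalled_by_wait(busy_trace):
    """Count the clock cycles a WAIT spends stalled.

    busy_trace is a hypothetical sequence of booleans, one per clock
    cycle, that is True while the coprocessor holds BUSY active.
    """
    stalled = 0
    for busy_active in busy_trace:
        if not busy_active:
            break  # BUSY inactive: the WAIT completes and execution resumes
        stalled += 1
    return stalled
```

Under these assumptions, a coprocessor that holds BUSY active for three sampled cycles stalls the main processor for three cycles, while a coprocessor that is already idle causes no stall beyond the cost of the WAIT instruction itself.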

It is desired to develop a method for removing (or
ignoring during execution) synchronization instructions from
an instruction sequence in computer systems.

Specifically, it is desired to develop a system for 
speeding up the execution of an instruction sequence by 
removing, under certain conditions, wait states created by 
synchronization instructions. 
A method for speeding up the execution of an
instruction sequence in a computer system implementing
functions of a general purpose processing unit and a
special purpose processing unit under common control is
described. The method eliminates certain wait states created by
synchronization instructions.

The method involves the steps of detecting that a
synchronization instruction has been encountered during the
execution of an instruction sequence and replacing or
substituting for the synchronization instruction a null
instruction. In the preferred embodiment, a one clock
cycle no-operation instruction is utilized as the null
instruction.

The microprocessor of the preferred embodiment
comprises an instruction prefetch unit for fetching
instructions prior to the execution of a previous
instruction. Such prefetch units are utilized in computer
systems to increase the operating speed of a computer
system by ensuring a queue of instructions is available for
an instruction decode and instruction execution unit.

In the present invention, after encountering a
synchronization instruction, but before execution of the
synchronization instruction, the next instruction ("second
instruction") is fetched by the prefetch unit. If the
instruction is one of a predetermined set of instructions,
the synchronization instruction is not executed and a null
instruction is executed in its place. If the instruction
is not one of the predetermined set of instructions, the
synchronization instruction is executed. (In the preferred
embodiment, the second instruction may not be prefetched in
certain circumstances for a variety of reasons. If the
second instruction is not prefetched, the synchronization
instruction is executed in the normal execution sequence.)
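
The three outcomes described above (second instruction in the predetermined set, second instruction not in the set, second instruction not prefetched) may be sketched as follows. This is an illustrative model only, not the circuitry of the preferred embodiment: the names `select_instruction` and `NULL_INSTRUCTION`, the use of mnemonic strings, and the use of `None` to represent an unsuccessful prefetch are all assumptions.

```python
from typing import Optional, Set

NULL_INSTRUCTION = "NOP"  # illustrative name for the one clock cycle null instruction


def select_instruction(sync_instruction: str,
                       prefetched: Optional[str],
                       predetermined_set: Set[str]) -> str:
    """Return the instruction to execute in place of a synchronization instruction.

    prefetched is the second instruction, or None if it was not prefetched.
    """
    if prefetched is None:
        # Second instruction unavailable: execute the synchronization
        # instruction in the normal execution sequence.
        return sync_instruction
    if prefetched in predetermined_set:
        # Second instruction synchronizes by itself: substitute a null instruction.
        return NULL_INSTRUCTION
    return sync_instruction
```

Note that the sketch defaults to executing the synchronization instruction whenever the second instruction cannot be examined, which mirrors the conservative behaviour stated in the parenthetical above.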
In the preferred embodiment, many floating point
instructions inherently provide for routine
synchronization. The predetermined set of instructions
comprises the set of such floating point instructions of
the instruction set of the microprocessor of the preferred
embodiment which inherently provide for routine
synchronization; floating point instructions which do not
inherently provide for such synchronization are not
included in the predetermined set. Further, non-floating
point instructions are not included in the predetermined
set.
BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a flow diagram illustrating a known method
implementing synchronizing control flow instructions.

Figure 2 is a block diagram of portions of a computer
system of the present invention.

Figure 3 is a flow diagram illustrating a method of
the present invention.
A method for processing instructions in a computer
system is described. In the following description,
numerous specific details are set forth such as specific
instructions, etc., in order to provide a thorough
understanding of the present invention. It will be
obvious, however, to one skilled in the art that the
present invention may be practiced without these specific
details. In other instances, well-known circuits,
structures and techniques have not been shown in detail in
order not to unnecessarily obscure the present invention.
OVERVIEW OF THE PRESENT INVENTION
The preferred embodiment of the present invention is
proposed for use in the next generation of microprocessor
in the Intel 8086 family, commonly referred to as the
80486™ microprocessor ('486), manufactured by Intel
Corporation of Santa Clara, California.

The proposed '486 microprocessor implements the
functions of a general purpose microprocessor (such as the
functions of the Intel 80386 microprocessor) and the
functions of a numeric coprocessor (such as the Intel 80387
numeric coprocessor) in a single component or "chip".
It is desired to ensure that the '486 microprocessor
is capable of supporting instruction sequences written for
execution on the 80386, including instruction sequences
which utilize a numeric coprocessor such as the 80387.
Such instruction sequences often include synchronization
instructions for synchronizing execution of the 80386 and
the 80387.

UTILIZATION OF SYNCHRONIZATION
Figure 1 is a flow diagram illustrating use of such
synchronization instructions in an instruction sequence. A
typical instruction sequence may include a synchronization
instruction which causes the main processor (e.g., the
'386) to suspend further processing until the numeric
coprocessor is available to execute another instruction,
block 101 and block 102.

Typically, the synchronization instruction is a WAIT
instruction. The synchronization instruction is placed in
the instruction sequence to ensure the numeric coprocessor
has completed execution of any prior floating point
instruction before presenting the next floating point
instruction for execution. (As discussed in the Background
of the Invention section, the 80387 asserts a BUSY signal
while processing an instruction. After completing
processing of the instruction, the 80387 deasserts the BUSY
pin. The 80386 will suspend execution during the time the
BUSY pin is asserted if a WAIT instruction is executed.)

It is worth noting that in many 80386/80387
implementations, programmers do not have to code WAIT
instructions in instruction sequences. Many assemblers for
the 80386 will automatically encode the WAIT instructions
in the instruction sequence.

After the numeric coprocessor indicates it is
available for processing of the next instruction (by
deasserting the BUSY pin), block 103, the next instruction
in the instruction sequence is presented for execution,
block 104. (Generally, the next instruction is a floating
point instruction; however, in certain cases, it may be a
non-floating point instruction.)
Certain instructions in the '386 instruction set
include a WAIT state as an integral part of the
instruction. In such cases, branch 110, the main processor
delays execution of the next instruction in the instruction
sequence until the numeric processor signals it has
completed processing, block 105 and block 106. Typically,
floating point instructions include a WAIT state as an
integral part of the instruction where the instruction will
affect memory or registers which may also be affected by
instructions executing on the general purpose
microprocessor.

In other cases, floating point instructions do not
include a WAIT state as an integral part of the
instruction. In such cases, the main processor does not
wait for the numeric coprocessor to complete processing of
the instruction, branch 111. As examples, this second type
of instruction includes the instructions listed in TABLE I,
below:

TABLE I

Opcode Function

(1) FSTENV (Store the coprocessor's environment);
(2) FSTCW (Store the coprocessor's control word);
(3) FSAVE (Save the coprocessor's state);
(4) FSTSW (Store the coprocessor's status word);
(5) FCLEX (Clear the coprocessor's exception flags);
(6) FINIT (Initialize the coprocessor);
(7) FSETPM (Place the coprocessor in protected mode);
(8) FENI (Enable interrupt); and
(9) FDIS (Disable interrupt).

(Note that the codes listed in Table I are operation
codes ("opcodes") in the '386 instruction set, not
mnemonics.)
In the case of either type of instruction, the main
processor will begin execution of the next instruction in
the instruction sequence at some point in time, block 107.

Further information on the 80386/80387 processors may
be found with reference to Chris H. Pappas & William H.
Murray, III, 80386 Microprocessor Handbook, Osborne
McGraw-Hill, 1988.

GENERAL 80486 ARCHITECTURE OVERVIEW
The proposed 80486 microprocessor comprises a general 
purpose microprocessor and a numeric coprocessor integrated 
on a single chip. The proposed '486 microprocessor further 
comprises instruction prefetch circuitry and instruction 
decode circuitry. 

Prefetch Circuitry
The prefetch circuitry is described with reference to 
Figure 2. A prefetch unit 202 is coupled with a bus 201. 
This allows the prefetch unit to fetch instructions for 
processing. The prefetch unit 202 is further coupled with 
an instruction decode unit 203. The instruction decode 
unit 203 is provided with instructions for decoding by the 
prefetch unit 202. Finally, the instruction decode unit 
203 is coupled with an execution unit 204 for providing 
microcoded instructions for execution. 
The prefetch unit 202 requests an instruction and
stores the instruction in a prefetch queue until the
instruction decode unit 203 is available to process the
instruction and translate the instruction into microcode.
An instruction queue in the instruction decode unit 203
holds the microcoded instructions until they are executed
by an execution unit 204.
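
The two-queue arrangement described above (a prefetch queue feeding the decode unit, and an instruction queue feeding the execution unit) may be modelled by the following sketch. The class and method names are hypothetical, decoding is reduced to tagging the instruction, and no timing or bus arbitration is modelled; the sketch is intended only to show the order in which instructions move between the units.

```python
from collections import deque


class Pipeline:
    """Toy model of prefetch unit 202, decode unit 203 and execution unit 204."""

    def __init__(self, memory):
        self.memory = deque(memory)    # instructions reachable over the bus
        self.prefetch_queue = deque()  # filled by the prefetch unit
        self.decoded_queue = deque()   # instruction queue in the decode unit

    def prefetch(self):
        # Fetch the next instruction, if any, into the prefetch queue.
        if self.memory:
            self.prefetch_queue.append(self.memory.popleft())

    def decode(self):
        # Move one instruction from the prefetch queue to the instruction
        # queue; translation into microcode is stood in for by a tag.
        if self.prefetch_queue:
            self.decoded_queue.append(("decoded", self.prefetch_queue.popleft()))

    def execute(self):
        # Hand the oldest decoded instruction to the execution unit.
        return self.decoded_queue.popleft() if self.decoded_queue else None
```

In this model, an instruction that has been prefetched but not yet decoded remains visible in `prefetch_queue`, which is the property the method relies upon when it examines the instruction following a synchronization instruction.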

Requirement for Synchronization Instructions
in the Preferred Embodiment
As discussed above, in the prior art it is known to
include synchronization instructions in instruction
sequences to ensure proper execution. The present
invention teaches that in certain cases, synchronization
instructions are not necessary for ensuring the proper
execution of the instruction sequence.

Specifically, the present invention teaches that
synchronization instructions may not be required depending
on the instruction immediately following a synchronization
instruction. If the instruction immediately following the
synchronization instruction includes a WAIT state as an
integral part of the instruction, it has been observed that
coding of a WAIT instruction before such an
instruction is not necessary.
As one objective of the present invention, a computer
system is to be developed which ensures compatibility with
instruction sequences written for systems such as the '386.
As a second objective, it is desired to increase the
performance of the computer system of the present invention
by effectively removing certain synchronization
instructions from such instruction sequences.

It is worth noting that, in the system of the
preferred embodiment, a WAIT instruction requires a minimum
of 3 clock cycles to complete. As will be detailed below,
the system of the present invention effectively removes
from execution certain WAIT instructions and replaces the
WAIT instruction with a null operation. In the preferred
embodiment, the null operation requires 1 clock cycle for
execution.

METHOD OF THE PREFERRED EMBODIMENT

A method utilized by the preferred embodiment to
improve execution time of an instruction sequence by
removing selected synchronization instructions is described
with reference to the flow diagram of Figure 3.

In an instruction sequence, a synchronization
instruction may be encountered, block 301. As stated
previously, instruction sequences commonly utilize the WAIT
instruction of the instruction set for the 8086 family of
microprocessors as a synchronization instruction.

As described previously, the WAIT instruction is used
to cause the general purpose processor portion of the
computer system to wait, or suspend, execution of
instructions until the numeric processor portion of the
computer system has finished a task.

The system of the present invention comprises means
for allowing the next instruction in the instruction
sequence to be fetched and examined prior to executing the
next instruction in the instruction sequence, block 302.
In the preferred embodiment, the prefetch unit waits until
the bus is available and fetches the next instruction and
stores it in a prefetch queue. In certain cases, such as
when the bus is servicing higher priority requests,
prefetching does not occur.

Assuming the prefetch was successful, block 303, the
present invention teaches determining whether the
instruction fetched is one of a set of instructions for
which the previous synchronization is not necessary. It
has been determined that synchronization instructions are
not necessary in instruction sequences before certain
so-called "safe" instructions.
In the preferred embodiment, these "safe" instructions
include instructions to be executed by the numeric
processor which include a WAIT state as an integral part of
the instruction. (In general, in the instruction set of
the processor of the preferred embodiment, these
instructions are the floating point instructions, such as
F2XM1, FABS, FADD/FADDP, FBLD, FBSTP, etc.)

The predetermined set of "safe" instructions in the
preferred embodiment does not include certain floating
point instructions which do not have synchronization built
into the instruction. Examples of these instructions are
listed with reference to TABLE I, above. Further, the
predetermined set of safe instructions of the preferred
embodiment does not include any instructions to be executed
by the general purpose processor (e.g., non-floating point
instructions such as LOOP, LSL, MOV, MUL, etc.).
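
The three categories just described may be illustrated with the following sketch, which uses only the instruction names that appear in this description. The set contents are therefore deliberately incomplete, and the names `SAFE_FP`, `UNSYNCHRONIZED_FP` and `classify` are hypothetical; in particular, any name not listed is treated as falling outside the predetermined set, which is an assumption chosen so that the sketch errs toward executing the synchronization instruction.

```python
# Floating point instructions named in the text as including a WAIT state.
SAFE_FP = frozenset({"F2XM1", "FABS", "FADD", "FADDP", "FBLD", "FBSTP"})

# Floating point instructions of TABLE I, which have no built-in synchronization.
UNSYNCHRONIZED_FP = frozenset({
    "FSTENV", "FSTCW", "FSAVE", "FSTSW", "FCLEX",
    "FINIT", "FSETPM", "FENI", "FDIS",
})


def classify(name: str) -> str:
    """Classify an instruction name relative to the predetermined set."""
    key = name.upper()
    if key in SAFE_FP:
        return "safe"
    if key in UNSYNCHRONIZED_FP:
        return "floating point, no built-in wait"
    # Everything else, including general purpose instructions such as MOV.
    return "not in set"
```

Only an instruction classified as "safe" would permit the preceding synchronization instruction to be replaced under the method described here.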

By way of example, the instruction sequence detailed 
in TABLE II may have been written for execution on a 
computer system utilizing an 80386: 
TABLE II

Instruction #  Instruction

(1)  WAIT
(2)  FINIT
(3)  WAIT
(4)  FILD Word Ptr (0006)
(5)  WAIT
(6)  FLDPI
(7)  WAIT
(8)  FDIV ST,ST(1)

Synchronization instructions numbered 3, 5 and 7 in
TABLE II are not necessary for proper execution of the
instruction sequence on a computer system embodying the
'486 processor. Therefore, the method of the present
invention replaces these synchronization instructions (WAIT
instructions) with a null instruction, block 305. The null
instruction requires one clock cycle for execution as
opposed to a minimum of three clock cycles for the WAIT
instruction.

Synchronization instruction numbered 1 in TABLE II is
required by the computer system of the present invention
and, therefore, is not replaced by a null instruction.
This synchronization instruction is required because the
FINIT instruction does not include a WAIT state as an
integral part of the instruction.
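
The treatment of the TABLE II sequence may be worked through with the following sketch. It assumes that every prefetch succeeds, omits operands, and treats FILD, FLDPI and FDIV as members of the predetermined set because the text states that the WAIT instructions preceding them are unnecessary. The function name `rewrite` and the use of "NOP" for the null instruction are illustrative only.

```python
WAIT_MIN_CYCLES = 3  # minimum cost of a WAIT in the preferred embodiment
NULL_CYCLES = 1      # cost of the null instruction

# TABLE II, mnemonics only (operands omitted for brevity).
TABLE_II = ["WAIT", "FINIT", "WAIT", "FILD", "WAIT", "FLDPI", "WAIT", "FDIV"]

# Instructions in TABLE II treated as having a built-in WAIT state.
SAFE = {"FILD", "FLDPI", "FDIV"}


def rewrite(sequence, safe):
    """Replace each WAIT that immediately precedes a safe instruction."""
    out = []
    for i, instr in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else None
        if instr == "WAIT" and following in safe:
            out.append("NOP")
        else:
            out.append(instr)
    return out


rewritten = rewrite(TABLE_II, SAFE)
replaced = rewritten.count("NOP")
min_cycles_saved = replaced * (WAIT_MIN_CYCLES - NULL_CYCLES)
```

Under these assumptions three WAIT instructions (numbers 3, 5 and 7) are replaced and the WAIT preceding FINIT is retained, so that at least 3 x (3 - 1) = 6 clock cycles are saved over the sequence. The actual saving may be larger, since three clock cycles is stated to be only the minimum cost of a WAIT.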

In general, in the system of the present invention,
synchronization instructions preceding floating point
instructions which provide routine synchronization are
replaced with a null instruction. The null instruction
does not perform any operation; rather it only affects the
(E)IP register (instruction pointer register).

In other cases, such as synchronization instructions
preceding non-floating point instructions and
synchronization instructions preceding floating point
instructions not providing routine synchronization, the
synchronization instructions are executed in the normal
course of the instruction sequence, block 306.

Thus, a method for avoiding time penalties associated
with synchronization instructions is described. Although
the present invention has been described with specific
reference to a number of details of the preferred
embodiment, it will be obvious that a number of
modifications and variations may be employed without
departure from the scope and spirit of the present
invention. Accordingly, all such variations and
modifications are included within the intended scope of the
invention as defined by the following claims.




CLAIMS

1. A method for processing an instruction sequence in 
a computer system comprising the steps of: 

(a) encountering a synchronization instruction in said
instruction sequence;

(b) replacing said synchronization instruction with a
null operation instruction.

2. The method as recited by Claim 1 further
comprising the steps of:

(c) fetching a second instruction in said instruction
sequence subsequent to receiving said synchronization
instruction;

(d) examining said second instruction to determine if
said second instruction is one of a predetermined set of
instructions;

(e) if said second instruction is one of said
predetermined set of instructions, performing step (b);

(f) if said second instruction is not one of said
predetermined set of instructions, executing said
synchronization instruction.
3. The method as recited by Claim 1 wherein said null
operation instruction requires one clock cycle for
execution.

4. The method as recited by Claim 2 wherein said
predetermined set of instructions comprises floating point
instructions with routine synchronization.

5. The method as recited by Claim 4 wherein said
synchronization instruction is a WAIT instruction.



6. A method for processing an instruction sequence in
a computer system, said instruction sequence comprising at
least one synchronization instruction, said method
comprising the steps of:

(a) receiving said synchronization instruction;

(b) fetching a second instruction;

(c) initiating execution of a null instruction in
place of said synchronization instruction under a first set
of predetermined conditions.

7. The method as recited by Claim 6 further
comprising the step of initiating execution of said
synchronization instruction under a second set of
predetermined conditions.

8. The method as recited by Claim 6 wherein said
first set of predetermined conditions comprises determining
said second instruction is one of a predetermined set of
instructions.

9. The method as recited by Claim 8 wherein said
predetermined set of instructions comprises floating point
instructions with routine synchronization in an instruction
set of said computer system.

10. The method as recited by Claim 8 wherein said
first set of predetermined conditions further comprises
determining said second instruction was successfully fetched.

11. The method as recited by Claim 7 wherein said
second set of predetermined conditions comprises
determining said second instruction was not successfully
fetched or said second instruction is not one of a
predetermined set of instructions.
12. In a computer system comprising a general purpose
processing unit and a special purpose processing unit under
common control, said computer system executing an
instruction sequence, a method comprising the steps of:

a) encountering a synchronization instruction in said
instruction sequence;

b) removing said synchronization instruction from said
instruction sequence prior to the execution of said
synchronization instruction.

13. The method as recited by Claim 12 wherein said
step (a) further comprises the step of determining if a
second instruction, immediately subsequent to said
synchronization instruction, is one of a predetermined set
of instructions and executing said step (b) if said second
instruction is one of said predetermined set of
instructions.

14. The method as recited by Claim 13 wherein said 
synchronization instruction comprises an instruction for 
causing said general purpose processor to enter a wait 
state. 
15. The method as recited by Claim 14 wherein a null
instruction is executed in place of said synchronization
instruction.

16. The method as recited by Claim 15 wherein said
null instruction requires one clock cycle for execution.

17. The method as recited by Claim 16 wherein said
synchronization instruction is a WAIT instruction.

18. A method for processing an instruction sequence in a
computer system substantially as hereinbefore described and illustrated
in the accompanying drawings.